SilNet: Single- and Multi-View Reconstruction by Learning from Silhouettes
Abstract
The objective of this paper is 3D shape understanding from single and multiple images. To this end, we introduce a new deep-learning architecture and loss function, SilNet, that can handle multiple views in an order-agnostic manner. The architecture is fully convolutional, and for training we use a proxy task of silhouette prediction, rather than directly learning a mapping from 2D images to 3D shape as has been the target in most recent work. We demonstrate that the SilNet architecture generalises over the number of views – for example, SilNet trained on 2 views can be used with 3 or 4 views at test time, and performance improves with more views. We introduce two new synthetic datasets: a blobby object dataset useful for pretraining, and a challenging and realistic sculpture dataset; and we demonstrate on these datasets that SilNet has indeed learnt 3D shape. Finally, we show that SilNet exceeds the state of the art on the ShapeNet benchmark dataset [6] at generating silhouettes in new viewpoints, and we use SilNet to generate novel views of the sculpture dataset.
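The abstract does not say how the views are combined, so the following is only a minimal sketch of one way an order-agnostic combination could work, not SilNet's actual method. It assumes each view has already been encoded into a feature vector and pools the views with an element-wise max, a symmetric operation that also accepts any number of views. The names `pool_views` and `silhouette_iou` are hypothetical, and intersection-over-union is shown only as one plausible way to compare a predicted silhouette with a target mask.

```python
from itertools import permutations
from typing import List, Sequence


def pool_views(view_features: Sequence[Sequence[float]]) -> List[float]:
    """Combine per-view feature vectors with an element-wise max.

    Assumption: max pooling is used here purely for illustration.
    Max is symmetric, so the result does not depend on view order,
    and it works for any number of views (at least one).
    """
    if not view_features:
        raise ValueError("need at least one view")
    return [max(column) for column in zip(*view_features)]


def silhouette_iou(pred: Sequence[Sequence[int]],
                   target: Sequence[Sequence[int]]) -> float:
    """Intersection-over-union of two binary masks of the same size."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


# Three hypothetical per-view feature vectors.
views = [[0.1, 0.9, 0.3], [0.7, 0.2, 0.4], [0.5, 0.5, 0.8]]
pooled = pool_views(views)

# Every ordering of the views yields the same pooled vector.
assert all(pool_views(list(p)) == pooled for p in permutations(views))

# The same function works unchanged with two views instead of three.
pooled_two = pool_views(views[:2])
```

Because the pooling step has no parameters tied to the number of views, a model built this way can in principle be trained with two views and evaluated with three or four, which is the kind of generalisation the abstract reports.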
Similar resources
Silhouettes for Calibration and Reconstruction from Multiple Views
SUDIPTA N. SINHA: Silhouettes for Calibration and Reconstruction from Multiple Views. (Under the direction of Marc Pollefeys) In this thesis, we study how silhouettes extracted from images and video can help with two fundamental problems of 3D computer vision, namely multi-view camera calibration and 3D surface reconstruction from multiple images. First, we present an automatic method for calibr...
Wide-Baseline Multi-View Video Segmentation For 3D Reconstruction
Obtaining a foreground silhouette across multiple views is one of the fundamental steps in 3D reconstruction. In this paper we present a novel video segmentation approach, to obtain a foreground silhouette, for scenes captured by a wide-baseline camera rig given a sparse manual interaction in a single view. The algorithm is based on trimap propagation, a framework used in video matting. Bayesia...
Single and sparse view 3D reconstruction by learning shape priors
In this paper, we aim to reconstruct free-form 3D models from only one or few silhouettes by learning the prior knowledge of a specific class of objects. Instead of heuristically proposing specific regularities and defining parametric models as previous research, our shape prior is learned directly from existing 3D models under a framework based on the Gaussian Process Latent Variable ...
Shape from Silhouette Consensus
Many applications in computer vision require the 3D reconstruction of a shape from its different views. When the available information in the images is just a binary mask segmenting the object, the problem is called shape from silhouette (SfS). As first proposed by Baumgart [1], the shape is usually computed as the maximum volume consistent with the given set of silhouettes. This is called visu...
Compressive Sensing for Background Subtraction
Compressive sensing (CS) is an emerging field that provides a framework for image recovery using sub-Nyquist sampling rates. The CS theory shows that a signal can be reconstructed from a small set of random projections, provided that the signal is sparse in some basis, e.g., wavelets. In this paper, we describe a method to directly recover background subtracted images using CS and discuss its a...
Journal:
- CoRR
Volume: abs/1711.07888
Publication date: 2017